Learning Nondeterministic Classifiers

نویسندگان

Juan José del Coz

Jorge Díez

Antonio Bahamonde

چکیده

Nondeterministic classifiers are defined as those allowed to predict more than one class for some entries from an input space. Given that the true class should be included in predictions and the number of classes predicted should be as small as possible, these kind of classifiers can be considered as Information Retrieval (IR) procedures. In this paper, we propose a family of IR loss functions to measure the performance of nondeterministic learners. After discussing such measures, we derive an algorithm for learning optimal nondeterministic hypotheses. Given an entry from the input space, the algorithm requires the posterior probabilities to compute the subset of classes with the lowest expected loss. From a general point of view, nondeterministic classifiers provide an improvement in the proportion of predictions that include the true class compared to their deterministic counterparts; the price to be paid for this increase is usually a tiny proportion of predictions with more than one class. The paper includes an extensive experimental study using three deterministic learners to estimate posterior probabilities: a multiclass Support Vector Machine (SVM), a Logistic Regression, and a Naı̈ve Bayes. The data sets considered comprise both UCI multi-class learning tasks and microarray expressions of different kinds of cancer. We successfully compare nondeterministic classifiers with other alternative approaches. Additionally, we shall see how the quality of posterior probabilities (measured by the Brier score) determines the goodness of nondeterministic predictions.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

متن کامل

دسته‌بندی پرسش‌ها با استفاده از ترکیب دسته‌بندها

Question answering systems are produced and developed to provide exact answers to the question posted in natural language. One of the most important parts of question answering systems is question classification. The purpose of question classification is predicting the kind of answer needed for the question in natural language. The literature works can be categorized as rule-based and learning...

متن کامل

Nondeterministic Classifier Performance Evaluation for Flow Based IP Switching

In modern IP networks, processing cost in network nodes is considered as a bottleneck. This problem is tackled with traffic based IP switching. The performance of traffic based IP switching depends heavily on flow classification. We demonstrate a method to evaluate the performance gains available with this technique with an optimal nondeterministic classifier giving a practical lower bound for ...

متن کامل

A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System

In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use boosting ensemble of weak classifiers to implement misuse intrusion detection system. It can identify new classes types of intrusions that do not exist in the training dataset for incremental misuse detection. As...

متن کامل

Approximate Active Learning of Nondeterministic Input Output Transition Systems

Constructing a model of a system for model-based testing, simulation, or model checking can be cumbersome for existing, third party, or legacy components. Active automata learning, a form of black-box reverse engineering, and in particular Angluin’s L algorithm, support the automatic inference of a model from a System Under Learning (SUL), through observations and tests. Most of the algorithms ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Journal of Machine Learning Research

دوره 10 شماره

صفحات -

تاریخ انتشار 2009

Learning Nondeterministic Classifiers

نویسندگان

چکیده

منابع مشابه

Application of ensemble learning techniques to model the atmospheric concentration of SO2

دسته‌بندی پرسش‌ها با استفاده از ترکیب دسته‌بندها

Nondeterministic Classifier Performance Evaluation for Flow Based IP Switching

A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System

Approximate Active Learning of Nondeterministic Input Output Transition Systems

عنوان ژورنال:

اشتراک گذاری